59 research outputs found

    Simulation, models, and refactoring of bacteriophage T7 gene expression

    Thesis (Sc. D.)--Massachusetts Institute of Technology, Biological Engineering Division, February 2007. Includes bibliographical references (leaves 108-124). By Sriram Kosuri.
    Our understanding of why biological systems are designed in a particular way would benefit from biophysically realistic models that can make accurate predictions on the time-evolution of molecular events given arbitrary arrangements of genetic components. This thesis is focused on constructing such models for gene expression during bacteriophage T7 infection. T7 gene expression is a particularly well-suited model system because knowledge of how the phage functions is thought to be relatively complete. My work focuses on two questions in particular. First, can we address deficiencies in past simulations and measurements of bacteriophage T7 to improve models of gene expression? Second, can we design and build refactored surrogates of T7 that are easier to understand and model? To address deficiencies in past simulations and measurements, I developed a new single-molecule, base-pair-resolved gene expression simulator named Tabasco that can faithfully represent mechanisms thought to govern phage gene expression. I used Tabasco to construct a model of T7 gene expression that encodes our mechanistic understanding. The model displayed significant discrepancies from new system-wide measurements of absolute T7 mRNA levels during infection. I fit transcript-specific degradation rates to match the measured RNA levels and as a result corrected discrepancies in protein synthesis rates that confounded previous models. I also developed and used a fitting procedure to the data that let us evaluate assumptions related to promoter strengths, mRNA degradation, and polymerase interactions. To construct surrogates of T7 that are easier to understand and model, I began the process of refactoring the T7 genome to construct an organism that is a more direct representation of the models that we build. In other words, instead of making our models ever more detailed to explain wild-type T7, we started to construct new phage that are more direct representations of our models. The goal of our original design, T7.1, was to physically define, separate, and enable unique manipulation of primary genetic elements. To test our initial design, we replaced the left 11,515 bp of the wild-type genome with 12,179 bp of engineered DNA. The resulting chimeric genome encodes a viable bacteriophage that appears to maintain key features of the original while being simpler to model and easier to manipulate. I also present a second generation design, T7.2, that extends the original goals of T7.1 by constructing a more direct physical representation of the T7 model.

    Simulation, Models, and Refactoring of Bacteriophage T7

    Our understanding of why biological systems are designed in a particular way would benefit from biophysically realistic models that can make accurate predictions on the time-evolution of molecular events given arbitrary arrangements of genetic components. This thesis is focused on constructing such models for gene expression during bacteriophage T7 infection. T7 gene expression is a particularly well-suited model system because knowledge of how the phage functions is thought to be relatively complete. My work focuses on two questions in particular. First, can we address deficiencies in past simulations and measurements of bacteriophage T7 to improve models of gene expression? Second, can we design and build refactored surrogates of T7 that are easier to understand and model? To address deficiencies in past simulations and measurements, I developed a new single-molecule, base-pair-resolved gene expression simulator named Tabasco that can faithfully represent mechanisms thought to govern phage gene expression. I used Tabasco to construct a model of T7 gene expression that encodes our mechanistic understanding. The model displayed significant discrepancies from new system-wide measurements of absolute T7 mRNA levels during infection. I fit transcript-specific degradation rates to match the measured RNA levels and as a result corrected discrepancies in protein synthesis rates that confounded previous models. I also developed and used a fitting procedure to the data that let us evaluate assumptions related to promoter strengths, mRNA degradation, and polymerase interactions. To construct surrogates of T7 that are easier to understand and model, I began the process of refactoring the T7 genome to construct an organism that is a more direct representation of the models that we build. In other words, instead of making our models ever more detailed to explain wild-type T7, we started to construct new phage that are more direct representations of our models. The goal of our original design, T7.1, was to physically define, separate, and enable unique manipulation of primary genetic elements. To test our initial design, we replaced the left 11,515 bp of the wild-type genome with 12,179 bp of engineered DNA. The resulting chimeric genome encodes a viable bacteriophage that appears to maintain key features of the original while being simpler to model and easier to manipulate. I also present a second generation design, T7.2, that extends the original goals of T7.1 by constructing a more direct physical representation of the T7 model.

    GeneJax: A Prototype CAD tool in support of Genome Refactoring

    Refactoring is a technique used by computer scientists for improving program design. The Endy Laboratory has adapted this process to make the genomes of biological organisms more amenable to human understanding and design goals. To assist in this endeavor, we implemented GeneJax, a prototype JavaScript web application for the dissection and visualization stages of the genome refactoring process. This paper reviews key genome refactoring concepts and then discusses the features, development history, user interface, and underlying implementation issues faced during the making of GeneJax. In addition, we provide recommendations for future GeneJax development. This paper may be of interest to engineers of CAD tools for synthetic biology.

    TABASCO: A single molecule, base-pair resolved gene expression simulator

    BACKGROUND: Experimental studies of gene expression have identified some of the individual molecular components and elementary reactions that comprise and control cellular behavior. Given our current understanding of gene expression, and the goals of biotechnology research, both scientists and engineers would benefit from detailed simulators that can explicitly compute genome-wide expression levels as a function of individual molecular events, including the activities and interactions of molecules on DNA at single base pair resolution. However, for practical reasons including computational tractability, available simulators have not been able to represent genome-scale models of gene expression at this level of detail.
    RESULTS: Here we develop a simulator, TABASCO, which enables the precise representation of individual molecules and events in gene expression for genome-scale systems. We use a single molecule computational engine to track individual molecules interacting with and along nucleic acid polymers at single base resolution. Tabasco uses logical rules to automatically update and delimit the set of species and reactions that comprise a system during simulation, thereby avoiding the need for a priori specification of all possible combinations of molecules and reaction events. We confirm that single molecule, base-pair resolved simulation using TABASCO (Tabasco) can accurately compute gene expression dynamics and, moving beyond previous simulators, provide for the direct representation of intermolecular events such as polymerase collisions and promoter occlusion. We demonstrate the computational capacity of Tabasco by simulating the entirety of gene expression during bacteriophage T7 infection; for reference, the 39,937 base pair T7 genome encodes 56 genes that are transcribed by two types of RNA polymerases active across 22 promoters.
    CONCLUSION: Tabasco enables genome-scale simulation of transcription and translation at individual molecule and single base-pair resolution. By directly representing the position and activity of individual molecules on DNA, Tabasco can directly test the effects of detailed molecular processes on system-wide gene expression. Tabasco would also be useful for studying the complex regulatory mechanisms controlling eukaryotic gene expression. The computational engine underlying Tabasco could also be adapted to represent other types of processive systems in which individual reaction events are organized across a single spatial dimension (e.g., polysaccharide synthesis).

    Reliable and accurate diagnostics from highly multiplexed sequencing assays

    Scalable, inexpensive, and secure testing for SARS-CoV-2 infection is crucial for control of the novel coronavirus pandemic. Recently developed highly multiplexed sequencing assays (HMSAs) that rely on high-throughput sequencing can, in principle, meet these demands, and present promising alternatives to currently used RT-qPCR-based tests. However, reliable analysis, interpretation, and clinical use of HMSAs require overcoming several computational, statistical, and engineering challenges. Using recently acquired experimental data, we present and validate a computational workflow based on kallisto and bustools that utilizes robust statistical methods and fast, memory-efficient algorithms to quickly, accurately, and reliably process high-throughput sequencing data. We show that our workflow is effective at processing data from all recently proposed SARS-CoV-2 sequencing-based diagnostic tests, and is generally applicable to any diagnostic HMSA.


    Multiplexed characterization of rationally designed promoter architectures deconstructs combinatorial logic for IPTG-inducible systems

    A crucial step towards engineering biological systems is the ability to precisely tune the genetic response to environmental stimuli. In the case of Escherichia coli inducible promoters, our incomplete understanding of the relationship between sequence composition and gene expression hinders our ability to predictably control transcriptional responses. Here, we profile the expression dynamics of 8269 rationally designed, IPTG-inducible promoters that collectively explore the individual and combinatorial effects of RNA polymerase and LacI repressor binding site strengths. We then fit a statistical mechanics model to measured expression that accurately models gene expression and reveals properties of theoretically optimal inducible promoters. Furthermore, we characterize three alternative promoter architectures and show that repositioning binding sites within promoters influences the types of combinatorial effects observed between promoter elements. In total, this approach enables us to deconstruct relationships between inducible promoter elements and discover practical insights for engineering inducible promoters with desirable characteristics.